[SPARK-16311][SQL] Metadata refresh should work on temporary views #14009
rxin wants to merge 9 commits into apache:master
[SPARK-16311][SQL] Improve metadata refresh
   */
  def invalidateTable(name: TableIdentifier): Unit = { /* no-op */ }
  def refreshTable(name: TableIdentifier): Unit = {
    // Go through temporary tables and invalidate them.
In the test case in HiveMetadataCacheSuite.scala, users might refresh the base table by calling spark.catalog.refreshTable("view_table"). Normally they do not specify the current database name, so the database name is empty and the table is treated as a temporary table. This comment might need a correction.
LGTM except a minor comment.
Test build #61599 has finished for PR 14009 at commit
Test build #61600 has finished for PR 14009 at commit
      val newCount = sql("select count(*) from view_refresh").first().getLong(0)
      assert(newCount > 0 && newCount < 100)
    }
  }}
This style is pretty weird...
LGTM except for a minor styling issue.
Thanks - I fixed the two comments. Going to merge it in master/2.0.
## What changes were proposed in this pull request?
This patch fixes the bug that the refresh command does not work on temporary views. This patch is based on #13989, but removes the public Dataset.refresh() API and improves test coverage.

Note that I actually think the public refresh() API is very useful. We can in the future implement it by also invalidating the lazy vals in QueryExecution (or alternatively just create a new QueryExecution).

## How was this patch tested?
Re-enabled a previously ignored test, and added a new Hive test suite covering the behavior of temporary views against MetastoreRelation.

Author: Reynold Xin <rxin@databricks.com>
Author: petermaxlee <petermaxlee@gmail.com>

Closes #14009 from rxin/SPARK-16311.

(cherry picked from commit 16a2a7d)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Test build #61773 has finished for PR 14009 at commit
Hi, @rxin.
Thank you for the fast fix!
What changes were proposed in this pull request?
This patch fixes the bug that the refresh command does not work on temporary views. This patch is based on #13989, but removes the public Dataset.refresh() API and improves test coverage.
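To picture what "refresh on a temporary view" involves, here is a minimal sketch in plain Scala. It is a toy model, not Spark's code: `Plan`, `FileRelation`, `Filter`, and `ToyCatalog` are hypothetical names invented for illustration. The assumption is that a temporary view is a stored plan whose leaves cache file listings, so refreshing the view means walking the plan and refreshing each leaf.

```scala
// Toy model only: every class here is hypothetical, not Spark's real API.
sealed trait Plan {
  def children: Seq[Plan]
  // Default behavior: refreshing a node refreshes everything beneath it.
  def refresh(): Unit = children.foreach(_.refresh())
}

// A leaf that caches a file listing, standing in for a file-based relation.
final class FileRelation(listFiles: () => Seq[String]) extends Plan {
  private var cached: Seq[String] = listFiles()
  def files: Seq[String] = cached
  val children: Seq[Plan] = Nil
  override def refresh(): Unit = { cached = listFiles() }
}

// An operator node that simply wraps a child plan.
final case class Filter(child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
}

final class ToyCatalog {
  private val tempViews = scala.collection.mutable.Map.empty[String, Plan]

  def createTempView(name: String, plan: Plan): Unit = { tempViews(name) = plan }

  // A name without a database qualifier may refer to a temporary view,
  // so refresh its plan if one is registered under that name.
  def refreshTable(database: Option[String], table: String): Unit = {
    if (database.isEmpty) tempViews.get(table).foreach(_.refresh())
  }
}

object ToyRefreshDemo extends App {
  var onDisk = Seq("part-0", "part-1")
  val relation = new FileRelation(() => onDisk)
  val catalog = new ToyCatalog
  catalog.createTempView("view_refresh", Filter(relation))

  onDisk = Seq("part-1")             // a file disappears underneath the view
  assert(relation.files.size == 2)   // the cached listing is now stale

  catalog.refreshTable(None, "view_refresh")
  assert(relation.files.size == 1)   // the listing was re-read through the view
}
```

The empty-database check mirrors the point raised in the review thread: an unqualified name can be a temporary view, but it can also be a table in the current database, so a real implementation would have to handle both cases.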
Note that I actually think the public refresh() API is very useful. We can in the future implement it by also invalidating the lazy vals in QueryExecution (or alternatively just create a new QueryExecution).
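The trade-off between the two options mentioned here can be illustrated with a small plain-Scala sketch. `ToyExecution` is a hypothetical stand-in, not Spark's `QueryExecution`. It relies on one language fact: a Scala `lazy val` is computed on first access and has no built-in way to be reset, so stale state either needs an explicit invalidation mechanism or a fresh object.

```scala
// Hypothetical stand-in for a query execution object; not Spark's QueryExecution.
final class ToyExecution(source: () => Int) {
  // Computed on first access, then fixed for the lifetime of this instance.
  lazy val result: Int = source()
}

object LazyValDemo extends App {
  var rows = 100
  val first = new ToyExecution(() => rows)
  assert(first.result == 100)

  rows = 90
  assert(first.result == 100)   // the lazy val still holds the old value

  // Creating a new instance is the simple way to observe the new state.
  val second = new ToyExecution(() => rows)
  assert(second.result == 90)
}
```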
How was this patch tested?
Re-enabled a previously ignored test, and added a new Hive test suite covering the behavior of temporary views against MetastoreRelation.
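For readers who want to try the behavior themselves, the following is a rough sketch of the kind of check quoted in the review thread. It is not the test added by this patch. It assumes Spark 2.x or later on the classpath and a local session. The object name, the temporary directory, and the row and partition counts are all made up for illustration.

```scala
import java.io.File
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Illustrative only: names and sizes are assumptions, not taken from the patch.
object TempViewRefreshSketch {
  def countAfterRefresh(spark: SparkSession): Long = {
    val dir = new File(Files.createTempDirectory("refresh_sketch").toFile, "data")

    // Write 100 rows spread over several Parquet part files.
    spark.range(0, 100, 1, 10).write.parquet(dir.getCanonicalPath)
    spark.read.parquet(dir.getCanonicalPath).createOrReplaceTempView("view_refresh")
    assert(spark.sql("select count(*) from view_refresh").first().getLong(0) == 100)

    // Delete one data file behind the view's back.
    val partFile = dir.listFiles().filter(f => f.getName.startsWith("part-")).head
    assert(partFile.delete())

    // Refresh by view name (no database qualifier), then count again.
    spark.catalog.refreshTable("view_refresh")
    spark.sql("select count(*) from view_refresh").first().getLong(0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("refresh-sketch")
      .getOrCreate()
    try {
      val newCount = countAfterRefresh(spark)
      // Same shape of assertion as in the reviewed test: some rows are gone, not all.
      assert(newCount > 0 && newCount < 100)
    } finally {
      spark.stop()
    }
  }
}
```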